Mining for Mutually Exclusive Items in Transaction Databases
Abstract
Association rule mining is a popular task that involves the discovery of co-occurrences of items in transaction databases. Several extensions of the traditional association rule mining model have been proposed so far; however, the problem of mining for mutually exclusive items has not been directly tackled yet. Such information could be useful in various cases (e.g., when the expression of a gene excludes the expression of another), or it can serve as a strong hint for revealing inherent taxonomical information. In this article, we address the problem of mining pairs of items such that the presence of one excludes the other. First, we provide a concise review of the literature; then we define the problem, propose a probability-based evaluation metric, and finally present a mining algorithm that we test on transaction data.
Introduction
Association rules are expressions that describe a subset of a transaction database. When mining for such patterns, we often end up with a large number of rules that appear too specific and not very interesting. A rule that relates two specific products in a market basket database is unlikely to be as strong as a rule that relates two groups or families of products. Hierarchical relationships among items in a database can be used to aggregate the weak, lower-level rules into strong, higher-level rules, producing hierarchical, multiple-level, or generalized association rules. However, such information is not always explicitly provided, although it might exist. Mining for taxonomies is a really challenging task that, to the best of our knowledge, has not been approached yet. Taxonomies are conceptual
Similar resources
On the Discovery of Mutually Exclusive Items in a Market Basket Database
Mining a transaction database for association rules is a particularly popular data mining task, which involves the search for frequent co-occurrences among items. One of the problems often encountered is the large number of weak rules extracted. Item taxonomies, when available, can be used to reduce them to a more usable volume. In this paper we introduce a new data mining paradigm, which invol...
Mining for Mutually Exclusive Gene Expressions
Association rules mining is a popular task that involves the discovery of co-occurrences of items in transaction databases. Several extensions of the traditional association rules mining model have been proposed so far; however, the problem of mining for mutually exclusive items has not been investigated. Such information could be useful in various cases in many application domains like bioinfor...
Mining Cross-Transaction Web Usage Patterns
Web Usage Mining is the application of data mining techniques to large web log databases in order to extract usage patterns. However, most of the previous studies on usage patterns discovery just focus on mining intra-transaction associations, i.e., the associations among items within the same user’s transactions. A cross-transaction association rule describes the association relationships amon...
Mining Multiple-level Association Rules Based on Pre-large Concepts
The goal of data mining is to discover important associations among items such that the presence of some items in a transaction will imply the presence of some other items. To achieve this purpose, Agrawal and his co-workers proposed several mining algorithms based on the concept of large itemsets to find association rules in transaction data (Agrawal et al., 1993a) (Agrawal et al., 1993b) (Agr...
Probabilistic Frequent Pattern Growth for Itemset Mining in Uncertain Databases
Frequent itemset mining in uncertain transaction databases semantically and computationally differs from traditional techniques applied on standard (certain) transaction databases. Uncertain transaction databases consist of sets of existentially uncertain items. The uncertainty of items in transactions makes traditional techniques inapplicable. In this paper, we tackle the problem of finding proba...
Publication date: 2007